-
The Energy Mover’s Distance (EMD) has seen use in collider physics as a metric between events and as a geometric method of defining infrared and collinear safe observables. Recently, the Spectral Energy Mover’s Distance (SEMD) has been proposed as a more analytically tractable alternative to the EMD. In this work, we obtain a closed-form expression for the Riemannian-like p = 2 SEMD metric between events, eliminating the need to numerically solve an optimal transport problem. Additionally, we show how the SEMD can be used to define event and jet shape observables by minimizing the distance between events and parameterized energy flows (similar to the EMD), and we obtain closed-form expressions for several of these observables. We also present the Specter framework, an efficient and highly parallelized implementation of the SEMD metric and SEMD-derived shape observables as an analogue of the previously introduced Shaper for EMD-based computations. We demonstrate that computing the SEMD with Specter can be up to a thousand times faster than computing the EMD with standard optimal transport libraries. Free, publicly-accessible full text available December 1, 2026.
-
We present the first study of anti-isolated Upsilon decays to two muons (Υ → μ⁺μ⁻) in proton-proton collisions at the Large Hadron Collider. Using a machine learning (ML)-based anomaly detection strategy, we “rediscover” the Υ in 13 TeV CMS Open Data from 2016, despite overwhelming anti-isolated backgrounds. We elevate the signal significance using these methods, relative to using the dimuon mass spectrum alone. Moreover, we demonstrate improved sensitivity from using an ML-based estimate of the multifeature likelihood compared to traditional “cut-and-count” methods. This is the first ever detection of anti-isolated Upsilons, which can be useful in the study of heavy-flavor fragmentation in quantum chromodynamics. Our Letter demonstrates that it is possible and practical to find real signals in experimental collider data using ML-based anomaly detection, and we distill a readily accessible benchmark dataset from the CMS Open Data to facilitate future anomaly detection developments. Published by the American Physical Society, 2025. Free, publicly-accessible full text available July 1, 2026.
-
Deconvolving (“unfolding”) detector distortions is a critical step in the comparison of cross-section measurements with theoretical predictions in particle and nuclear physics. However, most existing approaches require histogram binning while many theoretical predictions are at the level of statistical moments. We develop a new approach to directly unfold distribution moments as a function of another observable without having to first discretize the data. Our moment unfolding technique uses machine learning and is inspired by Boltzmann weight factors and generative adversarial networks (GANs). We demonstrate the performance of this approach using jet substructure measurements in collider physics. With this illustrative example, we find that our moment unfolding protocol is more precise than bin-based approaches and is as or more precise than completely unbinned methods. Published by the American Physical Society, 2024. Free, publicly-accessible full text available December 1, 2025.
-
Many machine learning applications involve learning a latent representation of data, which is often high-dimensional and difficult to directly interpret. In this work, we propose “moment pooling,” a natural extension of deep sets networks which drastically decreases the latent space dimensionality of these networks while maintaining or even improving performance. Moment pooling generalizes the summation in deep sets to arbitrary multivariate moments, which enables the model to achieve a much higher effective latent dimensionality for a fixed learned latent space dimension. We demonstrate moment pooling on the collider physics task of quark/gluon jet classification by extending energy flow networks (EFNs) to moment EFNs. We find that moment EFNs with latent dimensions as small as 1 perform similarly to ordinary EFNs with higher latent dimension. This small latent dimension allows for the internal representation to be directly visualized and interpreted, which in turn enables the learned internal jet representation to be extracted in closed form. Published by the American Physical Society, 2024.
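The pooling step described in this abstract can be illustrated with a minimal sketch. It assumes per-particle latent vectors have already been produced by some learned map, and it pools weighted monomials of those latents up to a chosen order. The function name `moment_pool`, the use of per-particle weights, and the ordering of the output are illustrative assumptions, not the authors' implementation. With `order=1` the sketch reduces to ordinary weighted sum pooling.

```python
from itertools import combinations_with_replacement
from math import comb, prod


def moment_pool(latents, weights, order):
    """Pool per-particle latent vectors into weighted multivariate moments.

    latents: list of latent vectors, one per particle, each of length L
    weights: one weight per particle (for example, energy fractions)
    order:   highest monomial degree to include (order=1 is plain sum pooling)

    Hypothetical helper for illustration only.
    """
    n_latent = len(latents[0])
    pooled = []
    for degree in range(1, order + 1):
        # Every unordered choice of `degree` latent indices gives one monomial.
        for idx in combinations_with_replacement(range(n_latent), degree):
            pooled.append(
                sum(w * prod(phi[a] for a in idx)
                    for phi, w in zip(latents, weights))
            )
    return pooled


# Two particles, two latent dimensions, moments up to second order.
latents = [[1.0, 2.0], [3.0, 4.0]]
weights = [0.5, 0.5]
features = moment_pool(latents, weights, order=2)

# The number of pooled features is the number of monomials of degree 1..k
# in L variables, which is comb(L + k, k) - 1.
n_features = comb(2 + 2, 2) - 1
```

In this sketch, two latent dimensions pooled to second order already yield five features, which is the sense in which a small learned latent space can feed a larger set of pooled quantities to the downstream network.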
-
To maximize the discovery potential of high-energy colliders, experimental searches should be sensitive to unforeseen new physics scenarios. This goal has motivated the use of machine learning for unsupervised anomaly detection. In this paper, we introduce a new anomaly detection strategy called FORCE: factorized observables for regressing conditional expectations. Our approach is based on the inductive bias of factorization, which is the idea that the physics governing different energy scales can be treated as approximately independent. Assuming factorization holds separately for signal and background processes, the appearance of nontrivial correlations between low- and high-energy observables is a robust indicator of new physics. Under the most restrictive form of factorization, a machine-learned model trained to identify such correlations will in fact converge to the optimal new physics classifier. We test FORCE on a benchmark anomaly detection task for the Large Hadron Collider involving collimated sprays of particles called jets. By teasing out correlations between the kinematics and substructure of jets, our method can reliably extract percent-level signal fractions. This strategy for uncovering new physics adds to the growing toolbox of anomaly detection methods for collider physics with a complementary set of assumptions. Published by the American Physical Society, 2024.
-
We present PAPERCLIP (Proposal Abstracts Provide an Effective Representation for Contrastive Language-Image Pre-training), a method which associates astronomical observations imaged by telescopes with natural language using a neural network model. The model is fine-tuned from a pre-trained Contrastive Language–Image Pre-training (CLIP) model using successful observing proposal abstracts and corresponding downstream observations, with the abstracts optionally summarized via guided generation using large language models (LLMs). Using observations from the Hubble Space Telescope (HST) as an example, we show that the fine-tuned model embodies a meaningful joint representation between observations and natural language through quantitative evaluation as well as tests targeting image retrieval (i.e., finding the most relevant observations using natural language queries) and description retrieval (i.e., querying for astrophysical object classes and use cases most relevant to a given observation). Our study demonstrates the potential for using generalist foundation models rather than task-specific models for interacting with astronomical data by leveraging text as an interface.
-
Infrared and collinear (IRC) safety has long been used as a proxy for robustness when developing new jet substructure observables. This guiding philosophy has been carried into the deep learning era, where IRC-safe neural networks have been used for many jet studies. For graph-based neural networks, the most straightforward way to achieve IRC safety is to weight particle inputs by their energies. However, energy-weighting by itself does not guarantee that perturbative calculations of machine-learned observables will enjoy small nonperturbative corrections. In this paper, we demonstrate the sensitivity of IRC-safe networks to nonperturbative effects, by training an energy flow network (EFN) to maximize its sensitivity to hadronization. We then show how to construct Lipschitz energy flow networks (L-EFNs), which are both IRC safe and relatively insensitive to nonperturbative corrections. We demonstrate the performance of L-EFNs on generated samples of quark and gluon jets, and showcase fascinating differences between the learned latent representations of EFNs and L-EFNs. Published by the American Physical Society, 2024.
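The energy-weighting idea mentioned in this abstract can be checked with a small sketch. It assumes each particle is an (energy fraction, angle) pair and uses `math.cos` as a hypothetical stand-in for a learned per-particle map; the names `energy_weighted_sum`, `jet`, `collinear_split`, and `soft_emission` are illustrative. The sketch only shows why a sum that is linear in energy is unchanged by collinear splittings and zero-energy emissions. It does not address the Lipschitz constraint or nonperturbative corrections.

```python
import math


def energy_weighted_sum(particles, phi):
    """Sum phi(angle) over particles, weighted linearly by energy fraction.

    particles: list of (energy_fraction, angle) pairs
    phi:       any function of the angle (stand-in for a learned map)
    """
    return sum(z * phi(theta) for z, theta in particles)


phi = math.cos  # hypothetical stand-in, not a trained network

# A toy two-particle jet.
jet = [(0.6, 0.1), (0.4, -0.3)]

# Collinear splitting: the first particle becomes two particles at the
# same angle whose energy fractions add up to the original 0.6.
collinear_split = [(0.25, 0.1), (0.35, 0.1), (0.4, -0.3)]

# Soft emission: an extra particle carrying zero energy at some other angle.
soft_emission = jet + [(0.0, 1.2)]

base = energy_weighted_sum(jet, phi)
after_split = energy_weighted_sum(collinear_split, phi)
after_soft = energy_weighted_sum(soft_emission, phi)
```

Because each term is linear in the energy fraction, `after_split` agrees with `base` up to floating-point rounding, and `after_soft` agrees exactly since the added term is zero.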
-
By quantifying the distance between two collider events, one can triangulate a metric space and reframe collider data analysis as computational geometry. One popular geometric approach is to first represent events as an energy flow on an idealized celestial sphere and then define the metric in terms of optimal transport in two dimensions. In this paper, we advocate for representing events in terms of a spectral function that encodes pairwise particle angles and products of particle energies, which enables a metric distance defined in terms of one-dimensional optimal transport. This approach has the advantage of automatically incorporating obvious isometries of the data, like rotations about the colliding beam axis. It also facilitates first-principles calculations, since there are simple closed-form expressions for optimal transport in one dimension. Up to isometries and event sets of measure zero, the spectral representation is unique, so the metric on the space of spectral functions is a metric on the space of events. At lowest order in perturbation theory in electron-positron collisions, our metric is simply the summed squared invariant masses of the two event hemispheres. Going to higher orders, we present predictions for the distribution of metric distances between jets in fixed-order and resummed perturbation theory as well as in parton-shower generators. Finally, we speculate on whether the spectral approach could furnish a useful metric on the space of quantum field theories.
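The remark about closed-form optimal transport in one dimension can be made concrete with a generic sketch. For two discrete distributions on a line with equal total weight, the optimal plan for a cost |x - y|^p with p >= 1 pairs mass in sorted order, so no numerical solver is needed. The function below is a standard one-dimensional construction, not the spectral metric of the paper; the name `wasserstein_1d` and its argument layout are illustrative assumptions.

```python
def wasserstein_1d(xs, ws, ys, vs, p=2):
    """Return the p-th power of the p-Wasserstein distance in one dimension.

    xs, ws: positions and weights of the first distribution
    ys, vs: positions and weights of the second distribution
    Both weight lists are assumed to sum to the same total (for example, 1).
    """
    a = sorted(zip(xs, ws))
    b = sorted(zip(ys, vs))
    i = j = 0
    rem_a, rem_b = a[0][1], b[0][1]  # mass still to be moved at each point
    cost = 0.0
    while i < len(a) and j < len(b):
        # Move as much mass as possible between the current sorted points.
        moved = min(rem_a, rem_b)
        cost += moved * abs(a[i][0] - b[j][0]) ** p
        rem_a -= moved
        rem_b -= moved
        if rem_a <= 1e-12:  # current source point is exhausted
            i += 1
            if i < len(a):
                rem_a = a[i][1]
        if rem_b <= 1e-12:  # current target point is exhausted
            j += 1
            if j < len(b):
                rem_b = b[j][1]
    return cost


# Half the mass at 0 and half at 3, transported to a single point at 1:
# 0.5 * 1**2 + 0.5 * 2**2 = 2.5
example = wasserstein_1d([0.0, 3.0], [0.5, 0.5], [1.0], [1.0], p=2)
```

The cost of this procedure is dominated by the two sorts, which is the practical payoff of working with a one-dimensional representation.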